Skip to content

SDCICD-1735: update krknai prompt and markdown-to-HTML conversion#3128

Merged
ritmun merged 2 commits intoopenshift:mainfrom
minlei98:SDCICD-1735
Mar 4, 2026
Merged

SDCICD-1735: update krknai prompt and markdown-to-HTML conversion#3128
ritmun merged 2 commits intoopenshift:mainfrom
minlei98:SDCICD-1735

Conversation

@minlei98
Copy link
Contributor

@minlei98 minlei98 commented Feb 26, 2026

  1. Update Krknai.yaml prompt
  2. Add MergeTemplates to PromptStore for loading package-local templates on top of defaults.
  3. Add markdown-to-HTML conversion for --report-format=html in the krkn-ai engine
  4. keeping the default osde2e analyzer unchanged.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for multiple report output formats: JSON, Markdown, and HTML
    • Enhanced cluster metadata tracking (type, region, environment, version, ID) integrated into reports
    • Improved Markdown report templates with comprehensive analysis sections and structured output
  • Tests

    • Updated test suite for new report format functionality and cluster metadata support

@openshift-ci-robot
Copy link

There are test jobs defined for this repository which are not configured to run automatically. Comment /test ? to see a list of all defined jobs. Review these jobs and use /test <job> to manually trigger jobs most likely to be impacted by the proposed changes.Comment /pipeline required to trigger all required & necessary jobs.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 26, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Feb 26, 2026

@minlei98: This pull request references SDCICD-1735 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

  1. Update Krknai.yaml prompt
  2. Add MergeTemplates to PromptStore for loading package-local templates on top of defaults.
  3. Add markdown-to-HTML conversion for --report-format=html in the krkn-ai engine
  4. keeping the default osde2e analyzer unchanged.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@minlei98
Copy link
Contributor Author

krknai-22report.html
Report with cluster info

@minlei98 minlei98 force-pushed the SDCICD-1735 branch 2 times, most recently from 22b5db7 to f5cd52d Compare February 26, 2026 21:04
func markdownToHTML(content string) string {
htmlTemplate, err := krknPrompts.ReadFile(htmlTemplatePath)
if err != nil {
return content
Copy link
Contributor

@ritmun ritmun Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this return or log an error?

## Test Configuration (GA params, scenarios, health check targets)
## Run Statistics (table: totals, generations, fitness scores, types)
## Genetic Algorithm Evolution (fitness trends, convergence, most disruptive generation)
## Top Vulnerabilities (top 3-5 by fitness: target, impact, severity [Critical/High/Medium/Low], why it matters)
Copy link
Contributor

@ritmun ritmun Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to include information about type of node (eg infra) in case of "Node CPU Hog" vulnerability?

@iamkirkbater
Copy link
Contributor

The generated HTML file is really nice. Super easy to read, straight to the point. Almost doesn't even read like it's generated by AI :D

The only thing I could potentially ask to add would be that somewhere in that file there's a link, or a reference to how I would figure out where the must-gather is for that run. Specifically - in the generated HTML document it references a node that caused the highest fitness score. However, without knowing any more details of that node I don't know what to do with that information. I assume that it was an infra node, but without being able to know what workloads were on it I can't really do much with the information this report generated.

That's my only "problem" with this - and I use the term "problem" very loosely. This is amazing!

@ritmun
Copy link
Contributor

ritmun commented Mar 2, 2026

Specifically - in the generated HTML document it references a node that caused the highest fitness score. However, without knowing any more details of that node I don't know what to do with that information

Thanks for the suggestion! Created card https://issues.redhat.com/browse/SDCICD-1766

@ritmun
Copy link
Contributor

ritmun commented Mar 2, 2026

krknai-22report.html Report with cluster info

  1. Can we add environment and remove cluster name?
  2. Cluster Type can combine - cloud provider (AWS/GCP), ocp type (ROSA/OSD) and hypershift (if hcp)

@minlei98
Copy link
Contributor Author

minlei98 commented Mar 2, 2026

krknai-49report.html
Here is the latest report with last week's test result.

@ritmun
Copy link
Contributor

ritmun commented Mar 2, 2026

Can we increase the font size for this highlighted code style to normal size?
image

@minlei98
Copy link
Contributor Author

minlei98 commented Mar 3, 2026

krknai-44report.html
report with node info and increased font.

@coderabbitai
Copy link

coderabbitai bot commented Mar 3, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2d2a3c43-2145-49ea-a5cd-be8f11e95b16

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

This pull request adds cluster metadata support and HTML report generation capabilities to the KrknAI analysis engine. It introduces ClusterInfo data structures across the aggregator and engine packages, refactors prompt loading through a central template registry, adds HTML output formatting with markdown-to-HTML conversion, and updates the prompt template with a Markdown-focused report structure.

Changes

Cohort / File(s) Summary
Dependency Management
go.mod
Added indirect dependency github.com/gomarkdown/markdown for markdown-to-HTML conversion support.
Core Data Models
pkg/krknai/aggregator/aggregator.go
Introduced new public ClusterInfo struct with ID, Version, Type, Region, and Environment fields; added clusterInfo field to KrknAIAggregator and ClusterInfo field to KrknAIData; added WithClusterInfo() method to propagate cluster metadata.
Analysis Engine Infrastructure
internal/analysisengine/engine.go, internal/prompts/prompts.go, internal/prompts/prompts_test.go
Added Type and Hypershift fields to ClusterInfo struct in analysis engine; introduced RegisterTemplates() public method on PromptStore for dynamic template registration; added test assertion for default template retrieval.
Primary Engine Implementation
pkg/krknai/analysisengine/engine.go
Major refactor of prompt loading from embedded template to dynamic registration via PromptStore; added ReportFormat config field supporting "json", "markdown", and "html" output; implemented markdownToHTML() helper for HTML conversion; added WithClusterInfo() method; enhanced template variable propagation with cluster info; integrated HTML rendering into result content generation; updated error messages and imports for new markdown/template libraries.
Engine Tests
pkg/krknai/analysisengine/engine_test.go
Updated import alias from krknAggregator to krknAgg; refactored tests to use centralized PromptStore with dynamic template registration; renamed test function and updated assertions for new template access patterns; extended test variables with ClusterInfo, Summary, TopScenarios; added validation for markdown and html output formats.
Prompt Templates
pkg/krknai/analysisengine/prompts/krknai.yaml, pkg/krknai/analysisengine/prompts/report.html
Completely redesigned prompt template to shift from JSON-structured output to comprehensive Markdown report with Executive Summary, Cluster Under Test, Test Configuration, and multi-section analysis; added variables block documenting ClusterInfo, Summary, TopScenarios, and other data structures; introduced new HTML report template with CSS styling and content placeholder for server-side rendering.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Engine
    participant Aggregator
    participant PromptStore
    participant MarkdownRenderer
    participant HTMLTemplate

    Client->>Engine: WithClusterInfo(info)
    Engine->>Aggregator: WithClusterInfo(info)
    Aggregator->>Aggregator: Store clusterInfo

    Client->>Engine: Run(ctx, config)
    Engine->>Aggregator: Collect()
    Aggregator->>Aggregator: Initialize KrknAIData with ClusterInfo
    Aggregator-->>Engine: Return KrknAIData with cluster metadata

    Engine->>PromptStore: LoadTemplates
    PromptStore-->>Engine: Return prompt + config

    Engine->>Engine: RenderPrompt(variables)
    Note over Engine: variables include ClusterInfo, Summary, TopScenarios
    Engine-->>Engine: Get rendered prompt content

    alt ReportFormat == "html"
        Engine->>MarkdownRenderer: Convert markdown content to HTML
        MarkdownRenderer-->>Engine: Return HTML string
        Engine->>HTMLTemplate: Render(report.html, htmlContent)
        HTMLTemplate-->>Engine: Return final HTML report
    else ReportFormat == "json" or "markdown"
        Engine-->>Engine: Use content as-is
    end

    Engine-->>Client: Return result with formatted content
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately summarizes the main changes: updating the krknai prompt template and adding markdown-to-HTML conversion functionality.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Stable And Deterministic Test Names ✅ Passed All test names are stable and deterministic with no dynamic elements like timestamps, UUIDs, or pod names.
Test Structure And Quality ✅ Passed Test code demonstrates good quality with focused test functions, proper setup/cleanup patterns using helper functions and defer statements, and appropriate timeout handling for unit tests.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 3, 2026

@minlei98: This pull request references SDCICD-1735 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

  1. Update Krknai.yaml prompt
  2. Add MergeTemplates to PromptStore for loading package-local templates on top of defaults.
  3. Add markdown-to-HTML conversion for --report-format=html in the krkn-ai engine
  4. keeping the default osde2e analyzer unchanged.

Summary by CodeRabbit

Release Notes

  • New Features

  • Added support for multiple report output formats: JSON, Markdown, and HTML

  • Enhanced cluster metadata tracking (type, region, environment, version, ID) integrated into reports

  • Improved Markdown report templates with comprehensive analysis sections and structured output

  • Tests

  • Updated test suite for new report format functionality and cluster metadata support

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
pkg/krknai/analysisengine/engine.go (1)

254-260: ⚠️ Potential issue | 🔴 Critical

Sanitize rendered markdown before casting to template.HTML.

Line 259 marks renderer output as trusted HTML. With LLM/log-derived markdown, this enables HTML/script injection in the generated report.

🔒 Suggested hardening
 import (
 	"bytes"
 	"context"
 	"embed"
 	"fmt"
 	"html/template"
 	"io/fs"
 	"os"
 	"path/filepath"
 	"time"
 
 	"github.com/gomarkdown/markdown"
 	mdhtml "github.com/gomarkdown/markdown/html"
 	"github.com/gomarkdown/markdown/parser"
+	"github.com/microcosm-cc/bluemonday"
 	"github.com/openshift/osde2e/internal/analysisengine"
 	"github.com/openshift/osde2e/internal/llm"
 	"github.com/openshift/osde2e/internal/llm/tools"
 	"github.com/openshift/osde2e/internal/prompts"
@@
-	body := markdown.ToHTML([]byte(content), p, renderer)
+	unsafeBody := markdown.ToHTML([]byte(content), p, renderer)
+	safeBody := bluemonday.UGCPolicy().SanitizeBytes(unsafeBody)
@@
-	if err := tmpl.Execute(&buf, struct{ Body template.HTML }{Body: template.HTML(body)}); err != nil {
+	if err := tmpl.Execute(&buf, struct{ Body template.HTML }{Body: template.HTML(string(safeBody))}); err != nil {
 		return "", fmt.Errorf("failed to execute HTML template: %w", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/krknai/analysisengine/engine.go` around lines 254 - 260, The rendered
markdown byte slice stored in variable body (from markdown.ToHTML called
alongside parser.NewWithExtensions and mdhtml.NewRenderer) is currently cast
directly to template.HTML before tmpl.Execute, allowing untrusted HTML/script
injection; fix this by sanitizing the renderer output before casting: add a
trusted sanitizer (e.g., bluemonday.StrictPolicy or a policy that allows safe
markdown tags) and run it on body (or string(body)), then cast the sanitized
result to template.HTML when building the struct passed to tmpl.Execute; ensure
imports are added and replace the direct template.HTML(body) usage with
template.HTML(sanitizedBody).
🧹 Nitpick comments (3)
internal/prompts/prompts_test.go (1)

10-18: Add a focused test for RegisterTemplates overwrite behavior.

Good assertion for default availability. Since this PR adds template overlay semantics, add one test that registers a same-ID template and verifies the default gets replaced.

🧪 Proposed test addition
 import (
 	"testing"
+	"testing/fstest"

 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 )

+func TestRegisterTemplates_OverridesExistingTemplate(t *testing.T) {
+	store, err := NewPromptStore(DefaultTemplates())
+	require.NoError(t, err)
+
+	overrideFS := fstest.MapFS{
+		"default.yaml": &fstest.MapFile{
+			Data: []byte("system_prompt: override-system\nuser_prompt: override-user\n"),
+		},
+	}
+
+	require.NoError(t, store.RegisterTemplates(overrideFS))
+
+	tmpl, err := store.GetTemplate("default")
+	require.NoError(t, err)
+	assert.Equal(t, "override-system", tmpl.SystemPrompt)
+	assert.Equal(t, "override-user", tmpl.UserPrompt)
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/prompts/prompts_test.go` around lines 10 - 18, Add a focused unit
test that verifies RegisterTemplates overwrites existing templates: create a
PromptStore via NewPromptStore(DefaultTemplates()), then call
store.RegisterTemplates(...) with a template object using the same ID (e.g.,
"default") but different content, retrieve it via store.GetTemplate("default")
and assert the template's content now matches the newly registered one (and not
the original). Reference NewPromptStore, DefaultTemplates, RegisterTemplates,
and GetTemplate in the test to locate the code paths to exercise.
pkg/krknai/analysisengine/prompts/krknai.yaml (1)

17-17: Reduce hallucination risk for expected status codes.

You require expected status codes in output; make the artifact-read instruction explicit for that field so the model doesn’t infer missing values.

✍️ Suggested prompt tweak
-  ## Test Configuration (GA params; list all enabled chaos scenarios; health check targets with name, endpoint URL, and expected status code)
+  ## Test Configuration (GA params; list all enabled chaos scenarios; health check targets with name, endpoint URL, and expected status code; fetch from artifacts when not present in prompt variables)

@@
-  Use read_file on relevant artifacts. Generate the full markdown report per system prompt structure.
+  Use read_file on relevant artifacts (especially krkn-ai.yaml for health-check expected status codes). Generate the full markdown report per system prompt structure.

Also applies to: 63-63

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/krknai/analysisengine/prompts/krknai.yaml` at line 17, The prompt's "##
Test Configuration" section currently asks for expected status codes but doesn't
force the model to read the artifact; update the krknai.yaml prompt to add an
explicit artifact-read requirement for the expected status code field (e.g.,
require "expected status code" or "expected_status_code" be populated only from
the provided artifact and never guessed), and mirror this change for the other
occurrence around line 63 so the model must extract that value from the artifact
rather than infer or fabricate it.
pkg/krknai/aggregator/aggregator.go (1)

103-107: Defensively copy ClusterInfo to avoid pointer aliasing side effects.

WithClusterInfo and Collect currently share the same pointer instance. A caller mutating info later can unintentionally change previously collected output.

♻️ Proposed defensive-copy update
 func (a *KrknAIAggregator) WithClusterInfo(info *ClusterInfo) *KrknAIAggregator {
-	a.clusterInfo = info
+	if info == nil {
+		a.clusterInfo = nil
+		return a
+	}
+	copied := *info
+	a.clusterInfo = &copied
 	return a
 }
 
 func (a *KrknAIAggregator) Collect(ctx context.Context, resultsDir string) (*KrknAIData, error) {
@@
-	data := &KrknAIData{
-		ClusterInfo: a.clusterInfo,
-	}
+	data := &KrknAIData{}
+	if a.clusterInfo != nil {
+		copied := *a.clusterInfo
+		data.ClusterInfo = &copied
+	}

Also applies to: 117-119

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/krknai/aggregator/aggregator.go` around lines 103 - 107, WithClusterInfo
currently stores the provided *ClusterInfo pointer directly causing aliasing;
change it to store a defensive copy instead (e.g., implement or call a
ClusterInfo.Clone/Copy and assign that to KrknAIAggregator.clusterInfo) so later
mutations by the caller don't change stored data; also ensure any use in Collect
reads from that copied instance (or makes another copy before including in
output) to avoid sharing the original pointer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/krknai/analysisengine/engine.go`:
- Around line 156-162: Validate ReportFormat explicitly before conversion: if
e.config.ReportFormat is "html" call markdownToHTML(content) as you already do,
but add a guard such as if e.config.ReportFormat != "" && e.config.ReportFormat
!= "html" { return nil, fmt.Errorf("unsupported ReportFormat: %q",
e.config.ReportFormat) } so typos or invalid values on e.config.ReportFormat are
rejected instead of silently falling back to raw; keep the markdownToHTML call
and its error handling unchanged.

---

Duplicate comments:
In `@pkg/krknai/analysisengine/engine.go`:
- Around line 254-260: The rendered markdown byte slice stored in variable body
(from markdown.ToHTML called alongside parser.NewWithExtensions and
mdhtml.NewRenderer) is currently cast directly to template.HTML before
tmpl.Execute, allowing untrusted HTML/script injection; fix this by sanitizing
the renderer output before casting: add a trusted sanitizer (e.g.,
bluemonday.StrictPolicy or a policy that allows safe markdown tags) and run it
on body (or string(body)), then cast the sanitized result to template.HTML when
building the struct passed to tmpl.Execute; ensure imports are added and replace
the direct template.HTML(body) usage with template.HTML(sanitizedBody).

---

Nitpick comments:
In `@internal/prompts/prompts_test.go`:
- Around line 10-18: Add a focused unit test that verifies RegisterTemplates
overwrites existing templates: create a PromptStore via
NewPromptStore(DefaultTemplates()), then call store.RegisterTemplates(...) with
a template object using the same ID (e.g., "default") but different content,
retrieve it via store.GetTemplate("default") and assert the template's content
now matches the newly registered one (and not the original). Reference
NewPromptStore, DefaultTemplates, RegisterTemplates, and GetTemplate in the test
to locate the code paths to exercise.

In `@pkg/krknai/aggregator/aggregator.go`:
- Around line 103-107: WithClusterInfo currently stores the provided
*ClusterInfo pointer directly causing aliasing; change it to store a defensive
copy instead (e.g., implement or call a ClusterInfo.Clone/Copy and assign that
to KrknAIAggregator.clusterInfo) so later mutations by the caller don't change
stored data; also ensure any use in Collect reads from that copied instance (or
makes another copy before including in output) to avoid sharing the original
pointer.

In `@pkg/krknai/analysisengine/prompts/krknai.yaml`:
- Line 17: The prompt's "## Test Configuration" section currently asks for
expected status codes but doesn't force the model to read the artifact; update
the krknai.yaml prompt to add an explicit artifact-read requirement for the
expected status code field (e.g., require "expected status code" or
"expected_status_code" be populated only from the provided artifact and never
guessed), and mirror this change for the other occurrence around line 63 so the
model must extract that value from the artifact rather than infer or fabricate
it.

ℹ️ Review info

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between c2744e8 and 950cd6a.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • go.mod
  • internal/analysisengine/engine.go
  • internal/prompts/prompts.go
  • internal/prompts/prompts_test.go
  • pkg/krknai/aggregator/aggregator.go
  • pkg/krknai/analysisengine/engine.go
  • pkg/krknai/analysisengine/engine_test.go
  • pkg/krknai/analysisengine/prompts/krknai.yaml
  • pkg/krknai/analysisengine/prompts/report.html

@ritmun
Copy link
Contributor

ritmun commented Mar 4, 2026

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 4, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 4, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: minlei98, ritmun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 4, 2026
@ritmun
Copy link
Contributor

ritmun commented Mar 4, 2026

/override ci/prow/hypershift-pr-check

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 4, 2026

@ritmun: Overrode contexts on behalf of ritmun: ci/prow/hypershift-pr-check

Details

In response to this:

/override ci/prow/hypershift-pr-check

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 4, 2026

@minlei98: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ritmun ritmun merged commit 25afd3e into openshift:main Mar 4, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants